AITopics

Country: Asia > China (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsApr-24-2026, 16:30:52 GMT

KS-GNN: Keywords Search over Incomplete Graphs via Graph Neural Network

Keyword search is a fundamental task to retrieve information that is the most relevant to the query keywords. Keyword search over graphs aims to find subtrees or subgraphs containing all query keywords ranked according to some criteria. Existing studies all assume that the graphs have complete information. However, real-world graphs may contain some missing information (such as edges or keywords), thus making the problem much more challenging. To solve the problem of keyword search over incomplete graphs, we propose a novel model named KS-GNN based on the graph neural network and the auto-encoder. By considering the latent relationships and the frequency of different keywords, the proposed KS-GNN aims to alleviate the effect of missing information and is able to learn low-dimensional representative node embeddings that preserve both graph structure and keyword features. Our model can effectively answer keyword search queries with linear time complexity over incomplete graphs. The experiments on four real-world datasets show that our model consistently achieves better performance than state-of-the-art baseline methods in graphs having missing information.

information retrieval, machine learning, natural language, (17 more...)

Country: Asia > China (0.46)

Genre: Research Report (0.66)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsFeb-7-2026, 11:39:27 GMT

0d7363894acdee742caf7fe4e97c4d49-Paper.pdf

graph, information, keyword, (13 more...)

Country:

Oceania > Australia > New South Wales (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(2 more...)

Neural Information Processing SystemsDec-23-2025, 18:28:22 GMT

KS-GNN: Keywords Search over Incomplete Graphs via Graphs Neural Network

Keyword search is a fundamental task to retrieve information that is the most relevant to the query keywords. Keyword search over graphs aims to find subtrees or subgraphs containing all query keywords ranked according to some criteria. Existing studies all assume that the graphs have complete information. However, real-world graphs may contain some missing information (such as edges or keywords), thus making the problem much more challenging. To solve the problem of keyword search over incomplete graphs, we propose a novel model named KS-GNN based on the graph neural network and the auto-encoder. By considering the latent relationships and the frequency of different keywords, the proposed KS-GNN aims to alleviate the effect of missing information and is able to learn low-dimensional representative node embeddings that preserve both graph structure and keyword features. Our model can effectively answer keyword search queries with linear time complexity over incomplete graphs. The experiments on four real-world datasets show that our model consistently achieves better performance than state-of-the-art baseline methods in graphs having missing information.

incomplete graph, information, keyword search, (5 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)

Chouhan, Ashish, Mandour, Saifeldin, Gertz, Michael

ClusterTalk: Corpus Exploration Framework using Multi-Dimensional Exploratory Search

arXiv.org Artificial IntelligenceDec-19-2024

Exploratory search of large text corpora is essential in domains like biomedical research, where large amounts of research literature are continuously generated. This paper presents ClusterTalk (The demo video and source code are available at: https://github.com/achouhan93/ClusterTalk), a framework for corpus exploration using multi-dimensional exploratory search. Our system integrates document clustering with faceted search, allowing users to interactively refine their exploration and ask corpus and document-level queries. Compared to traditional one-dimensional search approaches like keyword search or clustering, this system improves the discoverability of information by encouraging a deeper interaction with the corpus. We demonstrate the functionality of the ClusterTalk framework based on four million PubMed abstracts for the four-year time frame.

information retrieval, large language model, machine learning, (18 more...)

2412.14533

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area (0.32)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Neural Information Processing SystemsOct-9-2024, 12:22:05 GMT

KS-GNN: Keywords Search over Incomplete Graphs via Graphs Neural Network

Keyword search is a fundamental task to retrieve information that is the most relevant to the query keywords. Keyword search over graphs aims to find subtrees or subgraphs containing all query keywords ranked according to some criteria. Existing studies all assume that the graphs have complete information. However, real-world graphs may contain some missing information (such as edges or keywords), thus making the problem much more challenging. To solve the problem of keyword search over incomplete graphs, we propose a novel model named KS-GNN based on the graph neural network and the auto-encoder. By considering the latent relationships and the frequency of different keywords, the proposed KS-GNN aims to alleviate the effect of missing information and is able to learn low-dimensional representative node embeddings that preserve both graph structure and keyword features.

graph neural network, incomplete graph, information, (3 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.99)

Ahluwalia, Aman, Sutradhar, Bishwajit, Ghosh, Karishma, Yadav, Indrapal, Sheetal, Arpan, Patil, Prashant

Hybrid Semantic Search: Unveiling User Intent Beyond Keywords

arXiv.org Artificial IntelligenceSep-6-2024

At its core, semantic search hinges This paper addresses the limitations of on two crucial components. The first, the search traditional keyword-based search in function, acts similarly to traditional search understanding user intent and introduces a engines [1] by identifying and ranking novel hybrid search approach that leverages documents relevant to a user's query within a the strengths of non-semantic search engines, vast collection of information (corpus). However, Large Language Models (LLMs), and semantic search goes beyond this basic embedding models. The proposed system functionality with its second component: integrates keyword matching, semantic vector semantic understanding. This is where embeddings, and LLM-generated structured Transformers come into play, allowing the queries to deliver highly relevant and system to delve deeper than keyword matching.

query, retrieval, semantic search, (12 more...)

2408.09236

Country: Asia > India > Maharashtra > Pune (0.04)

Genre: Research Report (0.43)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.49)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)

arXiv.org Artificial IntelligenceAug-10-2024

Path-LLM: A Shortest-Path-based LLM Learning for Unified Graph Representation

Shang, Wenbo, Zhu, Xuliang, Huang, Xin

Unified graph representation learning aims to produce node embeddings, which can be applied to multiple downstream applications. However, existing studies based on graph neural networks and language models either suffer from the limitations of numerous training needed toward specific downstream predictions or have shallow semantic features. In this work, we propose a novel Path-LLM model to learn unified graph representation, which leverages a powerful large language model (LLM) to incorporate our proposed path features. Our Path-LLM framework consists of several well-designed techniques. First, we develop a new mechanism of long-to-short shortest path (L2SP) selection, which covers essential connections between different dense groups. An in-depth comparison of different path selection plans is offered to illustrate the strength of our designed L2SP. Then, we design path textualization to obtain L2SP-based training texts. Next, we feed the texts into a self-supervised LLM training process to learn embeddings. Extensive experiments on benchmarks validate the superiority of Path-LLM against the state-of-the-art WalkLM method on two classical graph learning tasks (node classification and link prediction) and one NP-hard graph query processing task (keyword search), meanwhile saving more than 90% of training paths.

graph, path-llm, shortest path, (12 more...)

2408.05456

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China > Hong Kong (0.04)
Asia > Singapore (0.04)
(5 more...)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.94)
Health & Medicine > Therapeutic Area > Immunology (0.70)
Health & Medicine > Epidemiology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Yusuf, Bolaji, Černocký, Jan "Honza", Saraçlar, Murat

Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units

arXiv.org Artificial IntelligenceJul-5-2024

End-to-end (E2E) keyword search (KWS) has emerged as an alternative and complimentary approach to conventional keyword search which depends on the output of automatic speech recognition (ASR) systems. While E2E methods greatly simplify the KWS pipeline, they generally have worse performance than their ASR-based counterparts, which can benefit from pretraining with untranscribed data. In this work, we propose a method for pretraining E2E KWS systems with untranscribed data, which involves using acoustic unit discovery (AUD) to obtain discrete units for untranscribed data and then learning to locate sequences of such units in the speech. We conduct experiments across languages and AUD systems: we show that finetuning such a model significantly outperforms a model trained from scratch, and the performance improvements are generally correlated with the quality of the AUD system used for pretraining.

acoustic unit, query, unit discovery, (13 more...)

2407.04652

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)
Asia > Middle East > Republic of Türkiye (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(2 more...)

Yusuf, Bolaji, Saraçlar, Murat

Written Term Detection Improves Spoken Term Detection

arXiv.org Artificial IntelligenceJul-5-2024

End-to-end (E2E) approaches to keyword search (KWS) are considerably simpler in terms of training and indexing complexity when compared to approaches which use the output of automatic speech recognition (ASR) systems. This simplification however has drawbacks due to the loss of modularity. In particular, where ASR-based KWS systems can benefit from external unpaired text via a language model, current formulations of E2E KWS systems have no such mechanism. Therefore, in this paper, we propose a multitask training objective which allows unpaired text to be integrated into E2E KWS without complicating indexing and search. In addition to training an E2E KWS model to retrieve text queries from spoken documents, we jointly train it to retrieve text queries from masked written documents. We show empirically that this approach can effectively leverage unpaired text for KWS, with significant improvements in search performance across a wide variety of languages. We conduct analysis which indicates that these improvements are achieved because the proposed method improves document representations for words in the unpaired text. Finally, we show that the proposed method can be used for domain adaptation in settings where in-domain paired data is scarce or nonexistent.

encoder, query, unpaired text, (16 more...)